Group 18: FINDING GENE PATTERNS IN BREAST CANCER DATA

Group 18

Introduction:

Most common cancer in women worldwide

1 in 8 women diagnosed

Many subtypes require a large collection of data

Question: Which genes are differentially expressed between the different subtypes of breast cancer?

General workflow


EXPLORATORY ANALYSIS AND TIDY:

Cleaning procedure

EXPLORATORY ANALYSIS AND TIDY:

Ratio between male and female

Age of female patients stratified by cancer status

Country of origin of female patients

Histological type of patient samples

ANALYSIS:

Male-female ratio

Many of the differentially expressed genes lie on the left side of the volcano plot, i.e. they have a negative log2 fold change.
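The left/right reading of a volcano plot can be sketched as a simple labelling rule. This is a minimal illustration, not the pipeline used here: the function name `classify_genes`, the cutoffs (|log2FC| >= 1, adjusted p < 0.05) and the example values are all assumptions. Which group counts as "down" depends on the contrast used in the analysis.

```python
import math

def classify_genes(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label each gene 'down', 'up', or 'ns' (not significant).

    results: list of (gene, log2_fold_change, adjusted_p_value) tuples.
    The cutoffs are illustrative assumptions, not values from the analysis.
    """
    labels = {}
    for gene, lfc, padj in results:
        if padj is None or math.isnan(padj) or padj >= padj_cutoff:
            labels[gene] = "ns"
        elif lfc <= -lfc_cutoff:
            labels[gene] = "down"  # left side of the volcano plot
        elif lfc >= lfc_cutoff:
            labels[gene] = "up"    # right side of the volcano plot
        else:
            labels[gene] = "ns"    # significant but small effect
    return labels

# Made-up example values, for illustration only.
example = [
    ("geneA", -2.3, 0.001),
    ("geneB", 1.8, 0.010),
    ("geneC", -0.4, 0.020),
    ("geneD", -3.1, 0.200),
]
labels = classify_genes(example)
counts = {k: list(labels.values()).count(k) for k in ("down", "up", "ns")}
print(counts)
```

Counting the "down" and "up" labels gives a quick numerical check of the visual impression that one side of the plot is more populated.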

DESEQ Analysis: Top pathways affected by DE genes

PCA Analysis:

Scree plot and cumulative variance explained by the principal components.

Figure panels: cp1, Cumulative

The high dimensionality required to explain 85% of the variability in the data suggests that the expression profiles are complex and that characterizing this cancer is a difficult task.
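How the cumulative curve yields a number of components can be sketched as follows. This is a minimal sketch: the function name `components_needed` and the toy eigenvalues are assumptions, and only the 85% threshold comes from the slide.

```python
def components_needed(eigenvalues, threshold=0.85):
    """Return (k, cumulative), where k is the smallest number of principal
    components whose cumulative explained variance reaches `threshold`."""
    total = sum(eigenvalues)
    # Variance ratio per component, largest first (as in a scree plot).
    ratios = [ev / total for ev in sorted(eigenvalues, reverse=True)]
    cumulative = []
    running = 0.0
    for r in ratios:
        running += r
        cumulative.append(running)
    for k, c in enumerate(cumulative, start=1):
        if c >= threshold:
            return k, cumulative
    return len(cumulative), cumulative

# Toy eigenvalues, not taken from the data set.
toy_eigenvalues = [5.0, 2.5, 1.5, 0.5, 0.5]
k, cumulative = components_needed(toy_eigenvalues)
print(k, [round(c, 2) for c in cumulative])
```

In the toy example a few components suffice. A large k on real expression data is what "high dimensionality" refers to.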

PCA Analysis:

Overlapped clustering of patients

The clusters of patients with and without tumor overlap, meaning the PCA does not clearly separate the two groups.
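One way to put a number on this overlap is the mean silhouette width of the samples in PC space. This measure was not part of the analysis presented here and is offered only as a possible check. The function name `mean_silhouette` and the toy coordinates are assumptions. Values near 1 indicate well-separated groups, while values near 0 or below indicate overlap.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette width for two or more labelled groups of points."""
    groups = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from point i to the other members of each group.
        mean_dist = {}
        for g in groups:
            others = [q for j, q in enumerate(points)
                      if labels[j] == g and j != i]
            if others:
                mean_dist[g] = sum(math.dist(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_dist:  # singleton group: silhouette defined as 0
            scores.append(0.0)
            continue
        a = mean_dist[own]                                      # within-group
        b = min(d for g, d in mean_dist.items() if g != own)    # nearest other
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)

# Toy (PC1, PC2) coordinates, for illustration only.
groups = ["tumor", "tumor", "normal", "normal"]
separated = mean_silhouette([(0, 0), (0, 1), (10, 0), (10, 1)], groups)
overlapping = mean_silhouette([(0, 0), (1, 0), (0.5, 0), (1.5, 0)], groups)
print(round(separated, 2), round(overlapping, 2))
```

A low score on the first few principal components would be consistent with the overlap seen in the plot.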

Discussion: Biological insights

  • The DE genes in the data most strongly affect X pathways
  • This does/does not make sense given the literature, as 1, 2, 3

Conclusion:

Next steps:

  • Clustering of DE genes
  • Identification of more subtypes with clustering